Abstract


This  final  report  summarizes  the  principal  results  of  a 
project  concerned  with  how  readers  identify  the  important  content 
in  technical  prose.  The  theoretical  framework  for  this  process  is 
that  the  important  content  of  a  passage  is  constructed  by  the 
reader  based  on  the  semantic  content  of  the  passage  together  with 
details  of  the  surface  structure  of  the  passage.  Thus,  not  only 
is  what  is  said  in  the  passage  important,  but  also  how  it  is  said. 
The  experimental  results  cover  several  semantic  and  surface 
structure  properties  that  are  central  to  the  process  of 
identifying  important  content,  and  some  of  the  strategies  that 
readers  use.  Simulation  models  of  the  comprehension  and  main  idea 
identification  processes  were  developed  and  tested  against  actual 
reader  behavior.  These  models  represent  the  general  theoretical 
framework  in  a  highly  specific  way,  and  thus  summarize  the  major 
results  of  the  project. 

This is the final report for a project concerned with
thematic  processes  in  the  comprehension  of  technical  prose. 
Thematic  processes  are  those  which  identify  or  derive  the 
important  content  in  a  piece  of  prose,  distinguishing  it  from  the 
details  or  irrelevancies.  Technical  prose  is  a  subtype  of 
expository  prose  that  is  concerned  with  presenting  information  of 
a  technical  nature.  The  appendix  in  this  report  lists  the 
reports,  publications,  and  presentations  resulting  from  the 
project  work. 

Technical  prose,  and  expository  prose  in  general,  has  not 
been  studied  by  cognitive  psychologists  as  heavily  as  story 
materials.  However,  understanding  how  people  comprehend  technical 
prose  is  of  immense  practical  importance  in  the  educational 
domain.  Most  textbooks  are  technical  prose.  They  present  densely 
packed  complex  information  that  is  usually  highly  novel  to  the 
reader.  A  second  area  where  technical  prose  is  important  is  in 
technical  documentation  such  as  instruction  or  maintenance 
manuals.  Expert  opinion  seems  to  be  that  technical  manuals  are 
not  very  comprehensible.  But  given  the  paucity  of  scientific 
knowledge  about  how  prose  of  this  type  is  understood,  it  is  hard 
for  any  agency  or  manufacturer  to  propose,  justify,  or  enforce 
substantial  standards. 

So  the  study  of  technical  prose  is  extremely  important  for 
practical  reasons.  It  is  also  important  for  scientific  reasons, 
in  that  technical  prose  could  have  its  own  distinctive  features. 
The  function  of  this  project  was  to  collect  a  set  of  results  on 
the  properties  of  technical  prose,  with  a  focus  on  the  thematic 
processes  by  which  a  reader  abstracts  the  gist,  or  important 
content,  from  technical  prose.  The  main  points  of  these  results 
were  summarized  in  the  form  of  computer  simulation  models  which 
were  tested  against  experimental  data.  Certain  methodological 
problems  in  both  the  experimentation  and  modelling  were  solved  in 
the  course  of  the  project.  A  more  detailed  and  complete  version 
of this summary can be found in Kieras (in press-a).

Theoretical Framework

The  schema  theory  of  comprehension  is  currently  very  popular 
as  an  explanation  for  many  of  the  features  of  prose  comprehension. 
But  as  argued  in  Kieras  (in  press-a),  it  seems  to  have  very  little 
applicability  to  the  comprehension  of  technical  prose.  A  more 
appropriate  theoretical  approach  would  be  one  emphasizing  those 
aspects  of  comprehension  that  work  at  the  level  of  processing 
individual  content  facts.  The  best  currently  available 
theoretical  framework  is  the  macrostructure  theory  developed  over 
the  last  decade  by  Kintsch  and  van  Dijk  (Kintsch,  1977;  van  Dijk, 
1977a, b,  1980;  Kintsch  &  van  Dijk,  1978).  This  framework,  with 
some  modifications,  was  used  in  the  project. 

The  macrostructure  theory  can  be  summarized  in  its  main 
content  very  briefly:  When  a  reader  comprehends  a  passage,  he  or 
she  first  extracts  the  microstructure  of  the  passage,  and  then 
applies macrorules to derive a passage macrostructure. The
microstructure represents the immediate content of the passage,
while  the  macrostructure  represents  the  gist,  or  important 
content,  of  the  passage.  The  macrorules  essentially  "boil  down" 
the  large  number  of  micropropositions  to  the  relatively  small 
number  of  macropropositions.  The  macrostructure  propositions  are 
then  given  priority  for  storage  in  memory.  Upon  recall,  the 
macropropositions  are  expanded  to  produce  a  paraphrased,  and 
possibly  distorted,  version  of  the  original  passage. 

In  terms  of  the  macrostructure  theory,  the  process  of 
abstracting  the  thematic  content  is  the  process  of  building  the 
passage  macrostructure.  However,  to  address  this  process  more 
directly,  it  is  necessary  to  modify  the  macrostructure  framework, 
because the focus of the Kintsch and van Dijk work has been on
explaining the properties of prose recall, and not on explicating the
process by which the macrostructure is built. As a result, the
macrostructure  building  process  has  been  studied  only  indirectly, 
with  the  mechanisms  of  memory  storage  and  retrieval  intervening. 
Also,  the  macro-rules  as  defined  thus  far  have  not  been  worked  out 
in  any  detail,  and,  more  importantly,  the  macro-rules  operate  only 
on  the  semantic,  or  propositional,  content  of  the  passage.  Other 
influences,  such  as  the  textual  surface  structure,  on  the 
macrostructure-building  process  need  to  be  included. 

The  framework  proposed  here  is  that  the 
macrostructure-building  process  uses  the  propositional  content  of 
the  passage  primarily,  but  is  guided  by  the  passage  surface 
structure.  In  particular,  there  seem  to  be  common  text  grammars 
which  specify  where  in  the  passage  important  information  is  likely 
to  appear,  and  there  are  several  surface-level  signals  that  mark 
individual  items  of  information  that  are  important  to  the  passage 
macrostructure.  Thus,  abstracting  the  main  content  from  a  passage 
depends  not  only  on  the  semantic  content  of  what  is  said,  but  also 
on  the  specifics  of  how  it  is  said,  both  at  the  level  of  the  whole 
passage,  and  at  the  level  of  individual  sentences.  Results  will 
be  summarized  below  that  support  this  point  of  view,  followed  by  a 
brief  description  of  a  simulation  model  that  illustrates  this  view 
of  the  macrostructure-building  process. 

Methods for Studying Thematic Processes

The  results  summarized  below  concern  how  subjects  identify 
the  main  content  of  a  passage.  They  were  obtained  by  using  an 
experimental  task  that  is  substantially  different  from  the  usual 
recall  task.  The  subject  is  given  a  paragraph-length  passage  to 
read,  and  then  is  asked  to  provide  a  statement  of  the  important 
content of the passage. This is either a statement of the main
item,  which  is  required  to  be  a  title-like  noun  phrase  that 
indicates  what  the  passage  is  about,  or  a  statement  of  the  main 
idea,  which  is  a  brief  complete  sentence  that  states  the  point,  or 
main  idea,  of  the  passage. 

The  main  idea  or  main  item  responses  can  be  analyzed  for 
content,  and  then  examined  to  determine  what  content  of  the 
passage  is  being  used  to  produce  the  response.  A  simple  way  to 
analyze the response content is to sort the responses into rough
categories. If this is done blind to a within-passage
manipulation,  the  results  will  be  reasonably  reliable.  However, 
there  is  no  way  to  compare  responses  obtained  for  completely 
different  passages  with  this  simple  method,  since  the  grain  of  the 
categories cannot be controlled. A more detailed approach, based
on a propositional analysis of the responses, is described in
Bovair and Kieras (Note 1).

Reading  times  for  the  entire  passage,  or  its  individual 
sentences,  can  be  collected  and  related  to  the  passage  structure 
and  content  of  the  responses.  Reading  times  can  reveal  changes  in 
the  amount  of  macroprocessing  required  by  a  passage  as  a  function 
of  manipulations  in  the  form  or  content  of  the  passage.  By  using 
a  sentence-at-a-time  procedure,  considerable  detail  can  be 
obtained  about  effects  of  passage  manipulations  on  reading  time. 

A  final  measure  used  in  the  project  work  is  importance 
ratings.  Subjects  engage  in  a  main  idea  response  task,  but  in 
addition,  they  rate  the  individual  passage  sentences  for 
importance  to  the  main  idea.  This  can  be  done  either  with  the 
entire  passage  present,  or  in  a  sentence-at-a-time  paradigm. 

Results


The relation of main items and main ideas. Theoretically,
the  main  item  is  simply  the  most  important  referent  in  the 
passage,  whereas  the  main  idea  is  the  most  important  proposition. 
Presumably, there should be an intimate relationship between these
two response forms. In Kieras (Note 2), subjects generated either
main  idea  or  main  item  responses  for  paragraphs  taken  from 
Scientific  American  articles.  One  way  the  two  response  types  were 
related  was  that  popular  main  items  were  also  popular  surface 
subjects  of  main  idea  responses,  corresponding  to  the  theoretical 
intuition  that  main  ideas  are  about  main  items.  A  second  result 
was  that  producing  the  main  item  of  a  passage  is  much  easier  than 
producing  the  main  idea.  The  average  completion  time  per 
paragraph  for  the  main  idea  task  was  about  40  seconds  longer  than 
for  the  main  item  task.  While  subjects  had  to  write  more  in  the 
main  idea  task,  it  seems  unlikely  that  this  large  amount  of  time 
was  required  simply  to  write  a  sentence  as  opposed  to  a  noun 
phrase.  Rather,  the  additional  time  must  reflect  a  substantial 
difference  in  the  processes  involved.  Identifying  the  main 
referent  involves  simply  singling  out  the  main  argument  of  the 
mass  of  propositions,  whereas  finding  the  main  proposition  would 
require  finding  the  set  of  arguments  that  is  most  important,  and 
then  picking  the  most  important  relation  connecting  them.  A 
similar  idea  at  a  simpler  level  appears  in  the  results  of  Manelis 
(1980)  and  Kieras  (1978),  which  suggest  that  in  simple  passages 
thematic  content  responses  may  be  determined  by  which  proposition 
is  "central"  in  the  passage  structure. 


Global coherence. One of the defining features of a
well-formed passage is global coherence, the property of the
passage being about one thing (van Dijk, 1979, 1980). Kieras
(1981a)  found  that  subjects  could  generate  main  item  statements 
for  passages  that  had  a  single  major  referent  much  faster  and  more 
consistently  than  for  passages  that  had  three  major  referents. 
While  this  is  a  simple  result,  examination  of  the  content  of  the 
responses  showed  that  when  there  is  a  single  major  referent, 
subjects  show  a  strong  tendency  to  simply  report  it  as  the  main 
item.  But  when  there  is  more  than  one  major  referent,  many 
readers  infer  another  referent  that  subsumes  the  three  that  were 
presented  in  the  passage.  Thus  readers  can  arrive  at  a  global 
topic  even  though  the  passage  does  not  have  an  obvious  explicit 
one.  This  extra  macro-level  processing  takes  additional  time,  and 
its  dependence  on  an  individual  reader's  idiosyncratic  inferences 
results  in  less  consistency  between  subjects. 

Different types of macrostructure. Passages differ in the
relationship  of  their  macrostructure  to  the  microstructure.  This 
issue  is  best  illustrated  by  referring  to  the  rules  that  van  Dijk 
(1977a,b,  1980)  proposed  for  the  construction  of  macrostructure. 
The two most important rules are (1) Generalization: A set of
propositions consisting of instances of a single general concept
can be replaced by the single general proposition; (2)
Construction-Integration: A series of propositions can be
replaced by their consequence. For example, a passage describing
the history of the Watergate affair can be summarized by the
statement "Nixon resigned because of Watergate."
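
The generalization rule can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the rule as formalized by van Dijk or as implemented in the project: propositions are reduced to (predicate, argument) pairs, and the `ISA` table, the concept labels, and the `min_instances` threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical is-a table: each specific referent maps to a general concept.
ISA = {
    "quartz clock": "modern timekeeping device",
    "atomic clock": "modern timekeeping device",
    "hydrogen maser clock": "modern timekeeping device",
}

def generalize(propositions, isa, min_instances=2):
    """Replace sets of propositions that share a predicate and whose
    arguments are instances of one general concept with a single
    general proposition. Other propositions are passed through."""
    groups = defaultdict(list)
    for pred, arg in propositions:
        # Group by predicate and by the general concept (None if unknown).
        groups[(pred, isa.get(arg))].append((pred, arg))
    result = []
    for (pred, general), members in groups.items():
        if general is not None and len(members) >= min_instances:
            result.append((pred, general))   # one macroproposition
        else:
            result.extend(members)           # left as micropropositions
    return result

micro = [
    ("accurate", "quartz clock"),
    ("accurate", "atomic clock"),
    ("accurate", "hydrogen maser clock"),
    ("expensive", "hydrogen maser clock"),
]
macro = generalize(micro, ISA)
print(macro)
```

Here the three "accurate" propositions collapse into one general proposition, while the single "expensive" proposition has no companions and is left unchanged. The construction-integration rule is not covered by this sketch, since it requires reasoning about antecedents and consequences rather than pattern matching over instances.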

Kieras  (Note  3)  examined  several  passages  whose  main  ideas 
were  based  on  either  the  generalization  rule  or  the 
construction-integration  rule.  The  conclusion  was  that  subjects 
were  faster,  and  more  consistent,  at  producing  main  idea 
statements  for  the  generalization  passages  than  for  the 
construction-integration ones. Moreover, if the main idea was
explicitly  stated  in  the  passage,  readers  were  generally  faster 
and  more  consistent  in  their  responding  than  if  it  were  absent, 
and  tended  to  reproduce  the  presented  main  idea  in  their 
responses.  But  this  effect  was  considerably  weaker  in  the 
construction-integration  passages,  suggesting  that  these 
macrostructures  were  considerably  more  difficult  to  identify  than 
the  generalization  structures. 

In  terms  of  macroprocessing,  in  the  generalization  passages, 
the  reader  must  simply  recognize  the  pattern  of  instances  of  the 
same  general  concept.  The  macrostructure  of  such  a  passage  is 
thus  a  rather  simple  single-layer  tree.  Providing  the 
generalization  explicitly  almost  guarantees  that  the  pattern  will 
be recognized. But, in the construction-integration passages, the
reader  must  be  able  to  deduce  or  recognize  the  chain  of 
antecedents  and  consequences  in  the  argument  being  presented,  or 
the  final  outcome  of  a  sequence  of  events.  This  reasoning  is  more 
complex  compared  to  the  generalization  passages,  and  so  is  slower, 
and  depends  more  on  the  idiosyncratic  knowledge  and  reasoning 
process of the individual subjects, and so is less consistent.
Even an explicit main idea may not be recognized as such by all
subjects.

Signals for Thematic Content

Sentence topic-comment assignment. Whether a referent
appears  as  the  surface  subject  of  passage  sentences  affects  its 
thematic  importance,  since  this  syntactic  position  usually  carries 
a  marking  of  the  sentence  topic.  Perfetti  and  Goldman  (1974, 
1975)  found  that  readers  assign  the  topical  referent  of  a  passage 
to  the  surface  subject  position  of  a  sentence,  and  will  use  the 
passive  voice,  if  necessary,  to  do  so.  Van  Dijk  (1979)  also 
pointed  out  that  the  assignment  of  items  to  either  the  topic  or 
comment  position  in  a  sentence  will  be  determined  by  the  global 
topic  of  the  passage.  However,  Kieras  (1981a)  showed  that 
topic-comment  assignment  could  influence  main  item  responses  by 
using  passages  in  which  the  sentence  topic-comment  assignment 
could  be  reversed,  while  essentially  preserving  the  propositional 
content.  This  result  shows  that  sentence  surface  structure  can 
influence  the  macrostructure-building  process. 

Weak  thematic  markers.  If  the  main  idea  is  otherwise  clear, 
marking  it  may  have  little  or  no  effect.  Some  explicit  markers  of 
thematic  content  are  titles,  which  name  the  main  item  explicitly, 
and marking phrases, such as "the important point is that...", which
signal  important  propositions. 

Effects of such markers on memory for passage content appear
to be weak. By using passages with two possible global topics and
manipulating  the  title  of  the  passage,  Schallert  (1976)  showed 
effects  of  titles  on  recognition  memory,  but  not  recall,  and 
Kozminsky  (1977)  found  recall  effects  that  were  fairly  weak. 
Likewise,  the  reported  effects  of  marking  phrases  on  recall 
(Meyer,  1977)  also  appear  to  be  weak. 

These  markers  also  appear  to  be  weak  in  influencing  thematic 
responses.  In  unpublished  work,  Kieras  used  passages  that  had 
already  been  used  in  main  item  and  main  idea  tasks,  and  so  a 
strong  and  a  weak  main  idea  or  item  could  be  chosen  for  each  one. 
A  comparison  was  done  for  both  main  idea  and  main  item  response 
tasks,  using  titles  and  marking  phrases,  with  either  no  marker, 
marking  the  strong  idea  or  item,  or  marking  the  weak  item  or  idea. 
Although  other  effects  appeared,  such  as  the  initial  mention 
effect  (see  below),  no  effects  of  the  marking  condition  on  main 
idea  or  main  item  statements  appeared  at  all.  The  conclusion  is 
that  when  the  thematic  content  is  reasonably  clear,  the  reader 
considers  the  markers  as  redundant,  or  simply  ignores  them. 

Since  the  effectiveness  of  various  forms  of  emphasis  is  an 
important  practical  question  in  document  design  (cf.  Charrow  & 
Redish,  Note  4;  Swarts,  Flower,  &  Hayes,  Note  5),  further  study 
of  them  would  be  worthwhile.  But  it  could  be  that  their  effects, 
if  any,  are  transient.  That  is,  titles  or  marking  phrases  may 
influence  which  hypotheses  about  the  main  idea  are  considered 
before the final result is arrived at, but this final result may
not reflect the markers at all. Of course, if the material were
almost  incomprehensible,  the  reader  might  be  forced  to  rely  much 
more  heavily  on  these  markers.  But  notice  that  the  results 
described  above  were  obtained  using  technical  passages  which  were 
in  fact  very  unfamiliar  to  readers.  Apparently,  the  semantic 
content  of  the  passages  was  usable,  in  spite  of  its  unfamiliarity, 
and  dominated  the  surface-level  markers.  This  is  an  instance  of 
the  principle  of  "shallow  semantics"  discussed  below. 

The  Role  of  Text  Structure 

Location of important information. At the level of text
structure  the  concern  is  where  in  the  passage  the  important 
information  appears,  as  opposed  to  the  nature  of  the  information 
itself.  One  question  is  whether  there  are  standard  locations  for 
important  information. 

Kieras  (Note  6)  reported  one  study  in  which  subjects  produced 
main idea statements for naturally occurring paragraphs from
Scientific  American.  When  the  content  of  the  responses  was 
compared  to  the  original  paragraph  sentences,  it  appeared  that 
when  the  source  of  the  main  idea  could  be  assigned  to  a  single 
sentence,  this  source  was  mostly  the  beginning  of  the  passage,  and 
to  some  extent,  the  end,  forming  a  U-shaped  function  with  a  very 
high  peak  on  the  first  sentence.  Another  study  in  Kieras  (Note  6) 
had  subjects  underline  the  most  important  sentence  in  page-length 
passages.  Again  the  bulk  of  the  responses  were  on  sentences  that 
occurred first or early in the passage, then with another, smaller
peak  at  the  end  of  the  passage.  Finally,  Kieras  (Note  3,  in 
press-b;  Kieras  &  Bovair,  Note  7)  collected  importance  ratings 
for  individual  sentences  in  paragraphs  in  which  the  main  idea  was 
either  explicitly  presented  in  the  first  sentence,  or  was  absent. 
When  present,  this  initially  presented  main  idea  sentence  was  very 
heavily  chosen  as  the  most  important  sentence.  However,  with  some 
passages,  sentences  appearing  at  the  end  were  also  considered 
fairly  important,  especially  if  the  initial  main  idea  sentence  was 
missing. 

These  results  suggest  that  the  initial  position  in  a  passage 
is  the  most  popular  location  for  important  information.  However, 
a  second  location  is  the  end  of  the  passage.  As  suggested  in 
Kieras (Note 3), some passages appear to have a structure
consisting  of  an  argument  leading  up  to  a  conclusion. 

The function of initial mention. The results summarized
above, along with theoretical considerations (e.g., Carpenter &
Just, 1977), suggest that the initial position in a passage is
uniquely important. Serial position effects in prose recall have
been observed (see Meyer, 1977), but these have usually been
attributed  to  the  fact  that  the  important  information  tends  to 
appear  first;  since  important  information  is  recalled  better, 
then  the  first  information  will  be  recalled  better  than  later 
information.  However,  the  initial  position  could  serve  to  mark 
information  as  important,  making  initially  mentioned  information 
thematically  important  to  some  extent  just  by  virtue  of  its 
position.

This  hypothesis  was  confirmed  by  the  studies  reported  in 
Kieras (1980), using the main idea and main item tasks. The
approach was to keep passage content constant, and only vary what
was  marked  as  thematic  by  variations  in  order  of  mention.  The 
results  show  that  an  item  or  idea  is  considered  more  thematically 
important  if  it  is  mentioned  first  than  if  mentioned  later  in  the 
passage.  Hence  readers  appear  to  expect  the  main  idea  or  item  to 
appear  first  in  a  passage,  and  so  assign  thematic  importance  to 
first-appearing  items.  Thematic  effects  produced  by  initial 
mention  also  appear  in  recall,  and  cannot  be  attributed  to  simple 
serial  position  effects  (Kieras,  1981c). 

Accompanying  the  thematic  role  of  initial  mention,  there  is 
also  an  important  reading  time  effect  of  initial  mention.  The 
first  sentence  in  a  passage  is  usually  read  for  a  relatively  long 
time. This has been demonstrated in two ways (see Kieras, Note 8,
1983b, in press-b; Kieras & Bovair, Note 7). The first comes from
studies manipulating the presence or position of an explicit
initial main idea sentence using a sentence-at-a-time paradigm.
The  sentence  that  follows  the  explicit  initial  main  idea  sentence 
is  either  the  second  sentence  in  the  passage  if  the  initial  main 
idea  is  present,  or  it  is  the  first  sentence  if  the  main  idea  is 
absent  or  elsewhere.  The  reading  time  on  this  sentence  is  longer 
if  it  appears  first  than  if  it  appears  second.  The  second 
demonstration  is  that  the  first  sentence  is  read  longer  than  would 
be  predicted  from  its  propositional  content  and  its  length.  That 
is,  using  either  a  statistical  model  for  sentence  reading  times, 
or  a  simulation  model  that  represents  parsing,  referential,  and 
representational  processes,  the  reading  time  on  individual 
sentences  can  be  predicted  (see  Kieras,  1981b,  in  preparation). 
Consistently, the reading time on the first sentence in a passage
is underpredicted by such variables; significantly better fits are
obtained by including a variable that represents that a sentence
occupies the first position. With the passages studied, the estimate thus
obtained  of  the  additional  time  required  on  the  first  sentence  is 
about  1  to  2  seconds. 
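
The second demonstration amounts to adding a first-position indicator to a regression on sentence properties. The following is a minimal sketch of that idea, not the statistical model fitted in the cited reports: the predictor set (proposition count, word count, first-position flag), the sentence data, and the coefficients in `TRUE` are all made up for illustration, with the first-position coefficient simply set inside the 1 to 2 second range mentioned above.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_least_squares(rows, times):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic sentences: (propositions, words, 1 if first in passage else 0).
sentences = [(4, 12, 1), (3, 10, 0), (6, 15, 0), (2, 9, 0),
             (5, 11, 1), (4, 14, 0), (3, 8, 0), (7, 16, 0)]

# Made-up coefficients in seconds: intercept, per proposition,
# per word, and extra time for occupying the first position.
TRUE = (0.5, 0.3, 0.1, 1.5)

design = [[1.0, p, w, f] for p, w, f in sentences]
times = [sum(c * x for c, x in zip(TRUE, row)) for row in design]

coef = fit_least_squares(design, times)
first_sentence_extra = coef[3]
print(round(first_sentence_extra, 3))
```

Because the synthetic reading times are generated without noise, the fit simply recovers the built-in first-position coefficient. With real data, the point of interest would be whether including the indicator improves the fit and how large its estimated coefficient is.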

Hence,  the  first  sentence  of  a  passage  appears  to  require 
more  processing  than  one  would  expect  based  on  its  other 
properties.  The  model-based  assessment  rules  out  some  of  the 
simple  explanations,  such  as  the  need  to  define  many  new  referents 
in  the  first  sentence.  Rather,  there  seems  to  be  a  unique 
function  of  the  first  sentence,  perhaps  one  of  "setting  the 
stage",  or  preparing  the  comprehension  system  to  process  a  large 
body  of  information  about  a  certain  subject  matter.  If  the  first 
sentence  actually  contains  the  main  idea,  then  this  stage-setting 
function  will  be  maximally  successful. 

A  good  example  of  the  importance  of  a  main  idea  appearing  in 
the  initial  position  is  the  results  in  Kieras  (in  press-b;  Kieras 
& Bovair, Note 7). These were obtained using passages based on
the  generalization  macro-rule  presented  in  a  sentence-at-a-time 
paradigm,  and  with  main  idea  responses,  reading  times,  importance 
ratings, and think-aloud protocols being collected. In these
passages,  the  main  idea  was  a  generalization,  either  explicitly 
stated  in  the  initial  position,  or  absent,  and  the  body  of  the 
passage  consisted  of  a  series  of  instances  of  the  generalization 
with  some  irrelevant  information  present  as  well.  The  instances 
were  ordered  along  some  dimension,  such  as  in  chronological  order. 

As  mentioned  above,  these  passages  have  a  fairly  simple 
macrostructure. But the initial appearance of the main idea can
make  a  substantial  difference  in  the  processing  that  subjects  do 
while  reading  through  the  passage.  In  brief,  if  an  obvious 
candidate  for  the  main  idea  appears  first,  then  the  reader  need 
only  adopt  it,  and  then  test  it  for  adequacy  while  reading  the 
remainder  of  the  passage.  If  not,  the  reader  must  attempt  to 
formulate  a  main  idea  while  reading,  and  be  prepared  to 
re-formulate  it  whenever  a  poor  fit  is  noticed.  As  a  result,  some 
sentences  may  appear  important  when  first  read,  but  then  later 
turn  out  to  be  merely  details  or  irrelevant.  Thus,  the  initially 
presented  explicit  main  idea  "protects"  the  reader  against  the 
irrelevant  material  or  alternative  possible  main  ideas,  and  so 
simplifies  arriving  at  a  main  idea. 
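
The adopt-and-test strategy described above can be illustrated with a toy procedure. This is a sketch under strong assumptions and should not be read as the simulation model reported in the project: each sentence is reduced to a tuple of hypothetical concept labels that a shallow reading might yield, a candidate "fits" a sentence if its label appears in that sentence, and a new candidate is simply the most frequent label seen so far.

```python
from collections import Counter

def propose(seen):
    """Pick the concept mentioned in the most sentences so far.
    Ties go to the concept encountered first."""
    counts = Counter(c for sentence in seen for c in sentence)
    return max(counts, key=counts.get)

def track_main_idea(sentences):
    """Read sentences in order while maintaining one candidate main
    idea. Returns the final candidate and how often it was changed."""
    candidate, revisions, seen = None, 0, []
    for sentence in sentences:
        seen.append(sentence)
        if candidate is None:
            candidate = propose(seen)        # adopt the first candidate
        elif candidate not in sentence:      # poor fit noticed
            new_candidate = propose(seen)    # try to re-formulate
            if new_candidate != candidate:
                candidate = new_candidate
                revisions += 1
    return candidate, revisions

# Hypothetical passage: explicit generalization first, then instances,
# with one irrelevant sentence mixed in.
with_main_idea = [
    ("accurate timekeeping",),
    ("quartz clock", "accurate timekeeping"),
    ("museum",),
    ("hydrogen maser clock", "accurate timekeeping"),
]
without_main_idea = with_main_idea[1:]   # same body, main idea absent

print(track_main_idea(with_main_idea))
print(track_main_idea(without_main_idea))
```

In this toy case both versions end with the same candidate, but only the version without the initial main idea sentence requires a revision along the way, because the first instance is adopted as the candidate and later has to be abandoned.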

Shallow Semantics

Macrostructure  processing  can  apparently  proceed  largely  on 
the  basis  of  only  limited,  or  shallow,  knowledge  of  the  semantics 
of  the  subject  matter.  Given  the  nature  of  the  technical  prose 
materials  studied  in  this  work,  one  would  suspect  that  the  typical 
college  student  would  be  completely  bewildered  by  the  subject 
matter,  and  so  be  forced  to  rely  almost  exclusively  on  surface 
level  cues  to  the  important  content.  However,  subjects  display  a 
marked  ability  to  comprehend  the  propositional  structure  of  a 
passage  at  a  shallow  level,  and  then  use  this  information  to 
identify  the  important  content.  This  level  of  comprehension 
actually  corresponds  very  closely  in  concept  to  the  process 
engaged  in  by  a  prose  researcher  of  constructing  a  propositional 
representation  of  a  passage.  Such  a  representation  has  many 
useful  properties,  such  as  its  connectivity  structure,  even  though 
it  does  not  represent  the  full  semantic  content  of  the  passage. 

Subjects  appear  to  be  able  to  use  the  shallow  semantic 
information  to  identify  the  main  content  even  when  the 
surface-level  markers  are  inconsistent,  or  when  the  content  is 
quite unfamiliar. The basic support for this assertion comes from the
following observations: (1) Surface-level markers do not dominate
subjects'  responses;  semantic  considerations  often  override  the 
surface  markers.  For  example,  in  Kieras  (1980),  initial  mention 
influences  the  choice  of  main  item,  but  usually  about  a  third  of 
the  responses  were  not  of  the  marked  item.  As  one  subject 
commented  during  debriefing,  of  course  the  topic  should  appear 
first,  but  sometimes  the  obvious  topic  was  elsewhere!  (2)  Even  in 
very  unfamiliar  material,  subjects  are  fairly  consistent  in 
assigning  importance  ratings  to  strongly  relevant  and  irrelevant 
sentences,  even  if  they  are  fairly  inconsistent  in  their  main  idea 
responses  (Kieras,  Note  3,  in  press-b;  Kieras  &  Bovair,  Note  7). 
(3) Subjects can use some of the easily inferred semantic
properties  of  sentence  terms  to  relate  an  individual  sentence  to  a 
main  idea.  For  example,  in  an  experiment  using  think-aloud 
protocol methods (Kieras, in press-b; Kieras & Bovair, Note 7), a
subject read the sentence "A hydrogen maser clock has pico-second
accuracy for million years" in a passage about how modern
timekeeping  devices  are  extremely  accurate,  and  commented,  "I 
don't  know  what  a  hydrogen  maser  is,  and  I  don't  know  what  a 
pico-second  is,  but  this  is  obviously  a  clock  that  is  extremely 
accurate".  (4)  Subjects  can  use  simple  superordination 
relationships  presented  in  a  passage  to  choose  main  items,  even 
when  the  terms  are  novel.  For  example,  in  Kieras  (1980,  1981c),  a 
passage was used in which biotransformation was described as a
general  process,  and  the  liver  was  introduced  as  an  organ  that 
performs  biotransformation,  and  then  further  described.  Subjects 
showed  a  very  strong  tendency  to  prefer  the  general,  but 
unfamiliar  term,  as  the  passage  topic,  even  though  propositions 
about  the  liver  were  much  better  recalled. 


Thus,  it  appears  that  at  least  a  major  portion  of 
macrostructure  processing  can  go  on  without  full  or  deep 
understanding  of  the  passage  content.  Hence,  the  role  of  general 
knowledge  in  these  tasks  is  relatively  limited;  when  subjects  can 
pick  out  central  propositions  using  only  superficial  or  simple 
semantic  relations,  more  detailed  knowledge  is  not  necessary. 
Note  that  the  college  students  used  as  subjects  in  this  and  most 
prose  research  probably  have  developed  during  their  long  history 
of  schooling  a  specialized  skill  for  dealing  with  complex  verbal 
material  without  fully  understanding  it. 

Thematic  content  is  specified  by  a  combination  of  information 
at  different  levels  and  of  different  types.  Overall,  it  seems 
that  the  most  important  information  source  is  the  propositional 
structure  of  the  passage  content.  In  the  discussion  above  of 
shallow  semantics,  it  was  pointed  out  how  people  can  deal  with 
complex  technical  prose  material  without  full  comprehension  of  it. 
Apparently  they  make  use  of  the  superficial  characteristics  of  the 
semantics  and  the  propositional  structure. 

That  the  propositional  and  semantic  content  of  a  passage 
would  be  the  most  important  determinant  of  macrostructure  is  in 
the  spirit  of  the  original  macrostructure  theory.  But  a  major 
modification  of  the  theory  is  that  surface-level  features,  both  of
individual  sentences  and  of  the  passage  as  a  whole,  also  influence
the  macrostructure-building  process.  This  is  an  important  point. 
Ever  since  Sachs's  (1967)  paradigmatic  study  showing  the  apparent 
unimportance  of  surface  form,  the  cognitive  psychology  of 
comprehension  has  tended  to  ignore  surface  structure  in  favor  of 
semantic  content.  However,  it  seems  clear  that  surface  structure 
is  normally  chosen  by  the  writer  in  an  attempt  to  convey  a  desired 
meaning  most  efficiently.  The  reader  is  expecting  these 
conventional  uses  of  surface  structure,  and  so  bases  his  or  her 
meaning  interpretation  on  them  to  some  extent.  Hence  an  adequate
theory  of  comprehension  must  explain  not  only  how  readers  derive
the  semantic  content  of  sentences  and  relate  them  to  already  known 
information,  but  also  how  the  surface  form  of  the  input  is  used  to 
guide  or  streamline  this  process. 

A  model  conforming  to  this  theoretical  approach  is  reported 
in  Kieras  (in  press-b;  Kieras  &  Bovair,  Note  7),  and  will  be 
briefly  described  here.  Although  the  model  is  rather  limited, 
having  been  applied  only  to  the  generalization  passages  described 
above,  it  does  illustrate  the  principle  that  main  ideas  can  be 
derived  with  only  shallow  semantic  knowledge,  and  through  the  use 
of  textual  and  sentence  surface  structure  as  well  as  the 
propositional  content. 

The  model  takes  the  propositionalized  form  of  the  passage  as
input,  and  processes  it  one  sentence  at  a  time.  It  sets  up  and 
maintains  a  hypothesized  or  candidate  main  idea  for  the  passage, 
and  may  modify  this  in  the  course  of  processing  the  passage.  The 
final  candidate  is  then  reported  as  the  main  idea.  For 
simplicity,  the  model  uses  only  a  single  proposition  as  its  main 
idea,  which  is  supposed  to  correspond  to  the  main  proposition  of  a 
subject's  main  idea  response. 

The  model  usually  adopts  a  candidate  main  idea  after  reading
the  first  sentence,  and  then  tests  whether  each  succeeding  sentence
is  an  instance  of  the  main  idea  generalization.  If  so,  the
model  proceeds  to  the  next  sentence.  But  if  not,  the  model  may, 
depending  on  a  decision  rule,  compute  a  new  candidate  main  idea. 
Different  rules  for  this  decision  to  revise  the  main  idea  are 
possible.  For  example,  a  revision  attempt  is  indicated  if  a  large
sentence  that  is  unrelated  to  the  current  main  idea  is
encountered,  or  if  the  model  has  accumulated  more  propositions  that
are  unrelated  to  the  main  idea  than  are  related.
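
The control loop just described can be sketched roughly as follows. This is only an illustration under stated assumptions, not the model's actual implementation: the names find_main_idea, is_instance, and propose are hypothetical, the threshold large is an arbitrary value, a sentence is assumed to fit when at least one of its propositions is an instance of the candidate, and resetting the counts after a revision is a guess.

```python
from typing import Callable, List, Sequence, Tuple

Proposition = Tuple[str, str, str]   # (predicate, subject, object)
Sentence = List[Proposition]


def find_main_idea(
    sentences: Sequence[Sentence],
    is_instance: Callable[[Proposition, Proposition], bool],
    propose: Callable[[Sequence[Sentence]], Proposition],
    large: int = 3,  # hypothetical size at which a sentence counts as "large"
) -> Tuple[Proposition, int]:
    """Return the final candidate main idea and the number of revisions."""
    if not sentences:
        raise ValueError("passage has no sentences")
    # Adopt a candidate after reading the first sentence.
    candidate = propose(sentences[:1])
    related = unrelated = revisions = 0
    for i, sentence in enumerate(sentences[1:], start=2):
        hits = sum(is_instance(p, candidate) for p in sentence)
        related += hits
        unrelated += len(sentence) - hits
        if hits > 0:
            continue  # the sentence fits the candidate; read on
        # Decision rule: revise on a large unrelated sentence, or when
        # unrelated propositions outnumber related ones.
        if len(sentence) >= large or unrelated > related:
            candidate = propose(sentences[:i])
            related = unrelated = 0  # assumption: counts restart after revising
            revisions += 1
    return candidate, revisions
```

The relation test and the rule for computing a new candidate are passed in as functions, so different decision rules can be tried without changing the loop.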

The  model  implements  van  Dijk’s  generalization  macro-rule 
with  rules  for  summarizing  a  set  of  specific  propositions  with  a 
general  one,  basically  by  finding  sets  of  propositions  whose 
arguments  are  concepts  that  share  supersets.  Thus,  the  general 
knowledge  required  by  the  model  is  quite  limited,  consisting  of 
little  more  than  set-superset  relations  for  the  arguments.  The 
fact  that  so  little  knowledge  is  required  may  explain  many  of  the
shallow  semantics  phenomena  described  above.
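
A minimal sketch of generalization over shared supersets is given below, assuming propositions are (predicate, subject, object) triples and that general knowledge is a simple concept-to-superset table. The table contents, the triple format, and the function names are illustrative assumptions, not taken from the model or from the passages used in the project.

```python
from typing import Dict, List, Optional, Tuple

Proposition = Tuple[str, str, str]   # (predicate, subject, object)

# Hypothetical set-superset knowledge; assumed to contain no cycles.
SUPERSET: Dict[str, str] = {
    "Hellenes": "cultures",
    "Romans": "cultures",
    "bronze": "metals",
    "iron": "metals",
}


def ancestors(concept: str) -> List[str]:
    """The concept followed by its chain of supersets, nearest first."""
    chain = [concept]
    while chain[-1] in SUPERSET:
        chain.append(SUPERSET[chain[-1]])
    return chain


def common_superset(concepts: List[str]) -> Optional[str]:
    """Nearest concept that covers every concept given, or None."""
    if not concepts:
        return None
    shared = ancestors(concepts[0])
    for concept in concepts[1:]:
        other = set(ancestors(concept))
        shared = [a for a in shared if a in other]
    return shared[0] if shared else None


def generalize(props: List[Proposition]) -> Optional[Proposition]:
    """Summarize specific propositions with one general proposition."""
    predicates = {p[0] for p in props}
    if len(predicates) != 1:
        return None  # only propositions with the same predicate are merged
    subject = common_superset([p[1] for p in props])
    obj = common_superset([p[2] for p in props])
    if subject is None or obj is None:
        return None
    return (predicates.pop(), subject, obj)
```

With this table, the two propositions ("use", "Hellenes", "bronze") and ("use", "Romans", "iron") generalize to ("use", "cultures", "metals"). Nothing beyond the superset table is consulted, which is the sense in which the required knowledge is shallow.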

But  detailed  general  knowledge  seems  to  be  quite  important  in 
the  micro-level  inferential  process  required  before  the 
macro-level  processing  can  be  applied.  For  example,  the  sentence
"The  Hellenes  used  bronze  swords"  cannot  be  related  to  the  main
idea  "cultures  use  metals"  until  an  inference  is  made.  The  model
performs  this  inference  with  a  rule  in  general  knowledge  that  if  X
uses  Y,  and  Y  is  made  of  Z,  then  X  uses  Z.  This  rule  yields
"Hellenes  use  bronze,"  which  can  be  related  to  the  main  idea.
Hence,  before  the  macroprocesses  can  work,  the  micro-level 
elaboration  and  inferences  must  be  done.  Intuitively,  such 
inferences  should  be  driven  by  the  current  hypothesized  main  idea. 
But  the  model  simply  makes  all  inferences  that  its  knowledge 
allows  before  proceeding,  a  simple,  but  rather  inefficient  and
unrealistic  process. 
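
The exhaustive strategy can be illustrated with a small forward-chaining sketch that applies the single rule "if X uses Y, and Y is made of Z, then X uses Z" until no new facts appear. The fact format, the relation names "uses" and "made-of", and the function name infer_uses are assumptions made for illustration; the model's own rule representation is not reproduced here.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]   # (relation, first argument, second argument)


def infer_uses(facts: Set[Fact]) -> Set[Fact]:
    """Apply 'X uses Y, Y made-of Z => X uses Z' until nothing new is derived."""
    known = set(facts)
    while True:
        new = {
            ("uses", x, z)
            for (r1, x, y) in known if r1 == "uses"
            for (r2, y2, z) in known if r2 == "made-of" and y2 == y
        } - known
        if not new:
            return known  # fixed point: every allowed inference has been made
        known |= new
```

Given the facts ("uses", "Hellenes", "bronze swords") and ("made-of", "bronze swords", "bronze"), the sketch adds ("uses", "Hellenes", "bronze"). Like the model, it derives everything it can rather than only the inferences relevant to the current candidate main idea.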

The  model  relies  heavily  on  the  surface  structure  of  the 
passage  and  the  sentences  in  arriving  at  a  main  idea.  At  the 
sentence  level,  the  model  makes  use  of  the  main  proposition  in  a 
sentence  more  than  the  others,  because  the  main  propositions  are 
usually  more  relevant  to  the  global  main  idea.  The  designation  of 
a  proposition  as  a  main  proposition  is  done  on  the  basis  of  the 
sentence  surface  structure;  it  is  the  one  representing  the  main 
verb  relating  the  surface  subject  to  the  surface  object. 

The  role  of  textual  surface  structure  is  very  important  in 
that  the  initial  mention  convention  is  a  central  part  of  the 
model's  processes.  If  the  model  determines  that  the  first 
sentence  main  proposition  contains  general  terms,  it  concludes 
that  the  first  sentence  contains  a  candidate  main  idea,  and  so 
adopts  the  main  proposition  as  its  first  main  idea.  It  then 
selects  a  relatively  conservative  criterion  for  deciding  when  to 
revise  the  main  idea  on  subsequent  sentences.  Thus,  if  a  passage 
has  an  explicit  first  sentence  main  idea,  the  model  adopts  it,  and 
will  keep  it  unless  it  encounters  a  severe  degree  of  inconsistent 
information  in  the  remainder  of  the  passage.  This  corresponds  to 
the  general  result  in  Kieras  (in  press-b;  Kieras  &  Bovair,  Note 
7)  that  the  first  sentence  main  idea  is  usually  produced  as  the 
main  idea  response,  with  few  revisions  occurring  along  the  way.

In  contrast,  if  the  first  sentence  is  not  general,  the  model 
either  generalizes  the  first  sentence  to  get  a  candidate  main 
idea,  or  waits  until  the  second  sentence  is  processed,  and  then 
generates  a  main  idea.  These  two  strategies  were  observed  in  the 
think-aloud  protocols  reported  in  Kieras  (in  press-b;  Kieras  & 
Bovair,  Note  7).  The  model  then  selects  a  liberal,  or 
"hair-trigger,"  criterion  for  revising  the  main  idea.  Since  the 
model  had  to  "guess"  a  main  idea,  it  must  be  prepared  to  abandon 
its  initial  guess  quickly  in  favor  of  another.  As  observed  in 
human  readers,  the  result  is  that  when  the  main  idea  is  not 
explicitly  stated,  the  model  changes  its  mind  relatively  often. 
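
The choice between the two revision criteria can be sketched as below. Several details are assumptions for illustration only: a term is treated as general when it appears as a superset in the knowledge table, both arguments must be general for the first sentence to count as general, and the criteria are represented as numeric thresholds with arbitrary values.

```python
from typing import Dict, Optional, Tuple

Proposition = Tuple[str, str, str]   # (predicate, subject, object)

# Hypothetical set-superset knowledge.
SUPERSET: Dict[str, str] = {"Hellenes": "cultures", "bronze": "metals"}

# Hypothetical thresholds: how much unrelated material is tolerated
# before a revision of the main idea is attempted.
CONSERVATIVE = 4
LIBERAL = 1


def is_general(term: str) -> bool:
    """Assumption: a term is general if something is listed as its subset."""
    return term in SUPERSET.values()


def initial_candidate(
    first_main: Proposition, wait: bool = False
) -> Tuple[Optional[Proposition], int]:
    """Return (candidate main idea, revision threshold) after sentence one."""
    pred, subj, obj = first_main
    if is_general(subj) and is_general(obj):
        # Explicit first-sentence main idea: adopt it and hold on to it.
        return first_main, CONSERVATIVE
    if wait:
        # Strategy 2: defer until the second sentence has been processed.
        return None, LIBERAL
    # Strategy 1: guess by generalizing the first sentence's arguments.
    guess = (pred, SUPERSET.get(subj, subj), SUPERSET.get(obj, obj))
    return guess, LIBERAL
```

The wait flag selects between the two strategies observed in the protocols for a non-general first sentence; in both cases the liberal threshold applies, so the guessed main idea is abandoned quickly.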

Conclusions

The  model  represents  the  combined  use  of  surface  and  semantic
information  in  arriving  at  a  main  idea.  The
surface  information  acts  to  guide  the  process  by  which  the 
semantic  content  is  used.  In  Kieras  (in  press-b;  Kieras  & 
Bovair,  Note  7)  more  detail  is  provided  on  how  the  model  conforms 
to  human  subjects  in  terms  of  reading  times,  importance  ratings, 
and  think-aloud  protocols.  While  the  model  has  some  serious 
problems  and  limitations,  its  overall  performance  is  encouraging. 
Hence  the  model  performs  its  function  of  summarizing  the  major 
features  of  how  people  abstract  main  ideas  from  technical  prose. 